The minimum word length in Li's article

Authors

  • Ramon Ferrer-i-Cancho
  • Brita Elvevåg
Abstract

Li [1] does not explicitly state that the minimum length of the words generated by his random text model is one. The examples of random texts shown at the beginning of his article (p. 1842 of [1]) contain sequences of more than one blank in a row, which can be confusing because later in the same article there is evidence that the author implicitly assumes that words have a minimum length of one. In Eqs. 3 and 15 of [1], summations are restricted to lengths greater than or equal to one. Further evidence of the absence of empty words in the simulations comes from Fig. 1 of [1]. There, the plots of the rank spectrum of a random text with equal character probabilities, for different alphabet sizes, start with plateaus as wide as the number of characters in the alphabet (excluding the space), confirming the absence of empty words. However, the manner in which the parameters of the simulations with unequal character probabilities are presented is confusing. Assume that there are N characters other than the space, that p1, ..., pi, ..., pN are the probabilities of each of these characters in Li's model, and that pb is the probability of a blank. Li's presentation of these probabilities allows one to conclude, incorrectly in view of the evidence above, that the probability of a blank does not depend on the number of characters already placed in the current word. In fact, if the current word does not yet have any characters, the probability of a blank is zero and the probabilities of the other characters are no longer valid. For this reason, it would be more accurate to state that pb, p1, ..., pN are the character probabilities when the word being constructed has at least one character. If not, then ...
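The constraint described in the abstract can be illustrated with a minimal sketch of a random text generator in which a blank is never emitted while the current word is empty. The function name `random_text`, its parameters, and the seeding are hypothetical and do not come from [1]. The abstract is cut off before it says which probabilities apply to an empty word, so the sketch assumes the non-blank probabilities are renormalised to p_i / (1 - pb) in that case; that choice is an assumption, not a statement of Li's or the authors' method.

```python
import random


def random_text(char_probs, p_blank, length, seed=0):
    """Return `length` characters of random text with no empty words.

    char_probs maps each non-blank character to its probability p_i,
    and p_blank is pb, so sum(p_i) + pb is expected to equal 1.
    """
    rng = random.Random(seed)
    chars = list(char_probs)
    weights = [char_probs[c] for c in chars]
    out = []
    word_len = 0  # characters placed so far in the current word
    for _ in range(length):
        # A blank is only allowed once the current word is non-empty.
        if word_len > 0 and rng.random() < p_blank:
            out.append(" ")
            word_len = 0
        else:
            # random.choices normalises the weights. For an empty word
            # this draws character i with probability p_i / (1 - pb)
            # (assumed renormalisation); otherwise the overall
            # probability of character i is p_i.
            out.append(rng.choices(chars, weights=weights)[0])
            word_len += 1
    return "".join(out)


text = random_text({"a": 0.4, "b": 0.4}, p_blank=0.2, length=1000)
```

Because a blank can only follow a non-blank character, the generated text never starts with a blank and never contains two blanks in a row, which is the behaviour the abstract infers from Eqs. 3 and 15 and Fig. 1 of [1].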


Similar resources

AIOSC: Analytical Integer Word-length Optimization based on System Characteristics for Recursive Fixed-point LTI Systems

The integer word-length optimization known as range analysis (RA) of the fixed-point designs is a challenging problem in high level synthesis and optimization of linear-time-invariant (LTI) systems. The analysis has significant effects on the resource usage, accuracy and efficiency of the final implementation, as well as the optimization time. Conventional methods in recursive LTI systems suffe...


Bertrand’s Paradox Revisited: More Lessons about that Ambiguous Word, Random

The Bertrand paradox question is: “Consider a unit-radius circle for which the length of a side of an inscribed equilateral triangle equals √3. Determine the probability that the length of a ‘random’ chord of a unit-radius circle has length greater than √3.” Bertrand derived three different ‘correct’ answers, the correctness depending on interpretation of the word, random. Here we employ geomet...


Word Clustering and Disambiguation Based on Co-occurrence Data

We address the problem of clustering words (or constructing a thesaurus) based on co-occurrence data, and using the acquired word classes to improve the accuracy of syntactic disambiguation. We view this problem as that of estimating a joint probability distribution specifying the joint probabilities of word pairs, such as noun verb pairs. We propose an efficient algorithm based on the Minimum ...


The Biology reproduction of Rocha (Rutilus rutilus caspicus) in the waters of Gomishan and Assurade Islands (South east of the Caspian Sea)

  This research was carried out with the aim of investigating the biological reproduction of fish retrieval in the Caspian Sea. Each Month catches were carried from Golestan province from December 2015 to September 2016. In total, 200 pieces of fish were caught to study reproductive production in the study area. In this research, the total length of the fish was 16.6 ± 1.5 mm and the frequency ...


Words Guaranteeing Minimum Image

Given a positive integer n and a finite alphabet Σ, a word w over Σ is said to guarantee minimum image if, for every homomorphism φ from the free monoid Σ∗ over Σ into the monoid of all transformations of an n-element set, the range of the transformation wφ has the minimum cardinality among the ranges of all transformations of the form vφ where v runs over Σ∗ . Although the existence of words g...




Publication date: 2010